Abstract

Time plays an essential role in the diffusion of in- 
formation, influence and disease over networks. 
In many cases we only observe when a node 
copies information, makes a decision or becomes 
infected - but the connectivity, transmission rates 
between nodes and transmission sources are un- 
known. Inferring the underlying dynamics is 
of outstanding interest since it enables forecast- 
ing, influencing and retarding infections, broadly 
construed. To this end, we model diffusion pro- 
cesses as discrete networks of continuous tempo- 
ral processes occurring at different rates. Given 
cascade data - observed infection times of nodes 
- we infer the edges of the global diffusion net- 
work and estimate the transmission rates of each 
edge that best explain the observed data. The op- 
timization problem is convex. The model nat- 
urally (without heuristics) imposes sparse solu- 
tions and requires no parameter tuning. The 
problem decouples into a collection of indepen- 
dent smaller problems, thus scaling easily to net- 
works on the order of hundreds of thousands of 
nodes. Experiments on real and synthetic data 
show that our algorithm both recovers the edges 
of diffusion networks and accurately estimates 
their transmission rates from cascade data. 



1. Introduction 

Diffusion and propagation processes have received increasing attention in a broad range of domains: information propagation (Adar & Adamic, 2005; Gomez-Rodriguez et al., 2010; Meyers & Leskovec, 2010), social networks (Kempe et al., 2003; Lappas et al., 2010), viral marketing (Watts & Dodds, 2007) and epidemiology (Wallinga & Teunis, 2004).

Observing a diffusion process often reduces to noting when
nodes (people, blogs, etc.) reproduce a piece of informa- 
tion, get infected by a virus, or buy a product. Epidemiolo- 
gists can observe when a person becomes ill but they can- 
not tell who infected her or how many exposures and how 
much time was necessary for the infection to take hold. In 
information propagation, we observe when a blog mentions 
a piece of information. However if, as is often the case, 
the blogger does not link to her source, we do not know 
where she acquired the information or how long it took 
her to post it. Finally, viral marketers can track when cus- 
tomers buy products or subscribe to services, but typically 
cannot observe who influenced customers' decisions, how 
long they took to make up their minds, or when they passed 
recommendations on to other customers. In all these sce- 
narios, we observe where and when but not how or why 
information (be it in the form of a virus, a meme, or a 
decision) propagates through a population of individuals. 
The mechanism underlying the process is hidden. How- 
ever, the mechanism is of outstanding interest in all three 
cases, since understanding diffusion is necessary for stop- 
ping infections, predicting meme propagation, or maximiz- 
ing sales of a product. 

This article presents a method for inferring the mechanisms 
underlying diffusion processes based on observed infec- 
tions. To achieve this aim, we construct a model incor- 
porating some basic assumptions about the spatiotemporal 
structures that generate diffusion processes. The assump- 
tions are as follows. First, diffusion processes occur over 
static (fixed) but unknown networks (directed graphs). Sec- 
ond, infections are binary, i.e., a node is either infected or 
it is not; we do not model partial infections or the partial 
propagation of information. Third, infections along edges 
of the network occur independently of each other. Fourth, 
an infection can occur at different times: the likelihood of 
node a infecting node b at time t is modeled via a proba- 
bility density function depending on a, b and t. Finally, we 
observe all infections occurring in the network during the 
recorded time window. Our aim is to infer the connectiv- 
ity of the network and the likelihood of infections across 
its edges after observing the times at which nodes in the
network become infected.

In more detail, we formulate a generative probabilistic 
model of diffusion that aims to describe realistically how 
infections occur over time in a static network. Finding 
the optimal network and transmission rates maximizing the 
likelihood of an observed set of infection cascades reduces 
to solving a convex program. The convex problem decou- 
ples into many smaller problems, allowing for natural par- 
allelization so that our algorithm scales to networks with 
hundreds of thousands of nodes. We show the effectiveness 
of our method by reconstructing the connectivity and con- 
tinuous temporal dynamics of synthetic and real networks 
using cascade data. 

Related work. The work most closely related to ours (Gomez-Rodriguez et al., 2010; Meyers & Leskovec, 2010) also uses a generative probabilistic model for inferring diffusion networks. Gomez-Rodriguez et al. (2010) (NetInf) infer network connectivity using submodular optimization, and Meyers & Leskovec (2010) (ConNIe) infer not only the connectivity but also a prior probability of infection for every edge using a convex program and some heuristics. However, both papers force the transmission rate between all nodes to be fixed rather than inferred.
In contrast, our model allows transmission at different rates 
across different edges so that we can infer temporally het- 
erogeneous interactions within a network, as found in real- 
world examples. Thus, we can now infer the temporal dy- 
namics of the underlying network. 

The main innovation of this paper is to model diffusion as a 
spatially discrete network of continuous, conditionally in- 
dependent temporal processes occurring at different rates. 
Infection transmission depends on the complex intricacies 
of the underlying mechanisms (e.g., a person's susceptibil- 
ity to viral infections depends on weather, diet, age, stress 
levels, prior exposures to similar pathogens and so on). We 
avoid modeling the mechanisms underlying individual in- 
fections, and instead develop a data-driven approach, suit- 
able for large-scale analyses, that infers the diffusion pro- 
cess using only the visible spatiotemporal traces (cascades) 
it generates. We therefore model diffusion using only time- 
dependent pairwise transmission likelihood between pairs 
of nodes, transmission rates and infection times, but not 
prior probabilities of infection that depend on unknown 
external factors. To the best of our knowledge, the continuous temporal dynamics of diffusion networks have not been modeled or inferred in previous work. We believe this is a key point for understanding diffusion processes.

2. Problem formulation 

This paper develops a method for inferring the spatiotemporal dynamics that generate observed infections. In this section we formulate our model, starting from the data it is designed for, and concluding with a precise statement of the network inference problem.

Data. Observations are recorded on a fixed population of $N$ nodes and consist of a set $C$ of cascades $\{\mathbf{t}^1, \ldots, \mathbf{t}^{|C|}\}$. Each cascade $\mathbf{t}^c$ is a record of observed infection times within the population during a time interval of length $T^c$. A cascade is an $N$-dimensional vector $\mathbf{t}^c := (t^c_1, \ldots, t^c_N)$ recording when nodes are infected, $t^c_k \in [0, T^c] \cup \{\infty\}$. The symbol $\infty$ labels nodes that are not infected during the observation window $[0, T^c]$ - it does not imply that nodes are never infected. The 'clock' is reset to 0 at the start of each cascade. Lengthening the observation window $T^c$ increases the number of observed infections within a cascade $c$ and results in a more representative sample of the underlying dynamics. However, these advantages must be weighed against the cost of observing for longer periods. For simplicity we assume $T^c = T$ for all cascades; the results generalize trivially.

The time-stamps assigned to nodes by a cascade induce the structure of a directed acyclic graph (DAG) on the network (which is not acyclic in general) by defining node $i$ to be a parent of node $j$ if $t_i < t_j$. Thus, it is meaningful to refer to parents and children within a cascade, but not on the network. The DAG structure dramatically reduces the computational complexity of the inference problem. Also, since the underlying network is inferred from many cascades (each of which imposes its own DAG structure), the inferred network is typically not a DAG.
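The data representation and the within-cascade parent relation described above can be sketched in a few lines of Python. This is only an illustration: the toy cascade, the function names `parents_in_cascade` and `dag_edges`, and the choice to list edges only between infected nodes are assumptions made here, not part of the original method description.

```python
import math

# Hypothetical toy cascade over N = 4 nodes, observed on a window [0, T] with T = 10.
# math.inf marks nodes that were not infected during the window.
cascade = [0.0, 2.5, math.inf, 7.0]

def parents_in_cascade(cascade, i):
    """Return the nodes j with t_j < t_i, i.e. the potential parents of an infected node i."""
    t_i = cascade[i]
    if math.isinf(t_i):
        return []  # node i was not infected, so there is no infection to explain
    return [j for j, t_j in enumerate(cascade) if t_j < t_i]

def dag_edges(cascade):
    """List the ordered pairs (j, i) with t_j < t_i among infected nodes."""
    return [(j, i) for i in range(len(cascade)) for j in parents_in_cascade(cascade, i)]
```

Because edges always point from an earlier to a later infection time, the pairs returned by `dag_edges` cannot form a cycle, which is the DAG property the text relies on.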

Pairwise transmission likelihood. The first step in model- 
ing diffusion dynamics is to consider pairwise interactions. 
We assume that infections can occur at different rates over 
different edges of a network, and aim to infer the transmis- 
sion rates between pairs of nodes in the network. 



Define $f(t_i \mid t_j; \alpha_{j,i})$ as the conditional likelihood of transmission between a node $j$ and node $i$. The transmission likelihood depends on the infection times $(t_j, t_i)$ and a pairwise transmission rate $\alpha_{j,i}$. A node cannot be infected by a node infected later in time. In other words, a node $j$ that has been infected at a time $t_j$ may infect a node $i$ at a time $t_i$ only if $t_j < t_i$. Although in some scenarios it may be possible to estimate a non-parametric likelihood empirically, for simplicity we consider three well-known parametric models: exponential, power-law and Rayleigh (see Table 1). Transmission rates are denoted as $\alpha_{j,i} \ge 0$ and $\delta$ is the minimum allowed time difference in the power-law to have a bounded likelihood. As $\alpha_{j,i} \to 0$ the likelihood of infection tends to zero and the expected transmission time becomes arbitrarily long. Without loss of generality, we consider $\delta = 1$ in the power-law model from now on.

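Exponential and power-law models are monotonic and have been previously used in modeling diffusion networks and social networks (Gomez-Rodriguez et al., 2010; Meyers & Leskovec, 2010). Power-laws model infections with long tails. The Rayleigh model is a non-monotonic parametric model previously used in epidemiology (Wallinga & Teunis, 2004). It is well-adapted to modeling fads, where infection likelihood rises to a peak and then drops extremely rapidly.

Table 1. Transmission likelihood, log survival function and hazard function of the exponential, power-law and Rayleigh models.

| Model | Transmission likelihood $f(t_i \mid t_j; \alpha_{j,i})$ | Log survival function $\log S(t_i \mid t_j; \alpha_{j,i})$ | Hazard function $H(t_i \mid t_j; \alpha_{j,i})$ |
|---|---|---|---|
| Exponential (Exp) | $\alpha_{j,i}\, e^{-\alpha_{j,i}(t_i - t_j)}$ if $t_j < t_i$, 0 otherwise | $-\alpha_{j,i}(t_i - t_j)$ | $\alpha_{j,i}$ |
| Power law (Pow) | $\frac{\alpha_{j,i}}{\delta}\left(\frac{t_i - t_j}{\delta}\right)^{-1-\alpha_{j,i}}$ if $t_j + \delta < t_i$, 0 otherwise | $-\alpha_{j,i} \log\left(\frac{t_i - t_j}{\delta}\right)$ | $\alpha_{j,i} \cdot \frac{1}{t_i - t_j}$ |
| Rayleigh (Ray) | $\alpha_{j,i}(t_i - t_j)\, e^{-\frac{1}{2}\alpha_{j,i}(t_i - t_j)^2}$ if $t_j < t_i$, 0 otherwise | $-\alpha_{j,i} \frac{(t_i - t_j)^2}{2}$ | $\alpha_{j,i}(t_i - t_j)$ |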



We recall some additional standard notation (Lawless, 1982). The cumulative density function, denoted $F(t_i \mid t_j; \alpha_{j,i})$, is computed from the transmission likelihoods. Given that node $j$ was infected at time $t_j$, the survival function of edge $j \to i$ is the probability that node $i$ is not infected by node $j$ by time $t_i$:

$$S(t_i \mid t_j; \alpha_{j,i}) = 1 - F(t_i \mid t_j; \alpha_{j,i}).$$

The hazard function, or instantaneous infection rate, of edge $j \to i$ is the ratio

$$H(t_i \mid t_j; \alpha_{j,i}) = \frac{f(t_i \mid t_j; \alpha_{j,i})}{S(t_i \mid t_j; \alpha_{j,i})}.$$

The hazard functions of our models are simple (Table 1).
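To make the relation between likelihood, survival and hazard concrete, the following Python sketch encodes the three parametric models with $\delta = 1$, writing `dt` for the time difference $t_i - t_j$. The function names and the string keys `"exp"`, `"pow"`, `"ray"` are illustrative assumptions; the likelihood is obtained from the definition above as $f = H \cdot S$.

```python
import math

def log_survival(model, dt, alpha):
    """log S(t_i | t_j; alpha) for a time difference dt = t_i - t_j > 0 (delta = 1)."""
    if model == "exp":
        return -alpha * dt
    if model == "pow":
        return -alpha * math.log(dt)      # meaningful for dt > delta = 1
    if model == "ray":
        return -alpha * dt ** 2 / 2.0
    raise ValueError(f"unknown model: {model}")

def hazard(model, dt, alpha):
    """H(t_i | t_j; alpha), the instantaneous infection rate of the edge."""
    if model == "exp":
        return alpha
    if model == "pow":
        return alpha / dt
    if model == "ray":
        return alpha * dt
    raise ValueError(f"unknown model: {model}")

def likelihood(model, dt, alpha):
    """f(t_i | t_j; alpha), recovered from the definition H = f / S."""
    return hazard(model, dt, alpha) * math.exp(log_survival(model, dt, alpha))
```

A quick sanity check is that the hazard equals the negative derivative of the log survival function with respect to the time difference, which holds for all three models.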

Probability of survival given a cascade. We compute the probability that a node survives uninfected until time $T$, given that some of its parents are already infected. Consider a cascade $\mathbf{t} := (t_1, \ldots, t_N)$ and a node $i$ not infected during the observation window, $t_i > T$. Since each infected node $k$ may infect $i$ independently, the probability that nodes $1 \ldots N$ do not infect node $i$ by time $T$ is the product of the survival functions of the infected nodes $1 \ldots N \mid t_k \le T$ targeting $i$,

$$\prod_{t_k \le T} S(T \mid t_k; \alpha_{k,i}). \tag{1}$$



Likelihood of a cascade. Consider a cascade $\mathbf{t} := (t_1, \ldots, t_N)$. We first compute the likelihood of the observed infections $\mathbf{t}^{\le T} = (t_1, \ldots, t_N \mid t_i \le T)$. Since we assume infections are conditionally independent given the parents of the infected nodes, the likelihood factorizes over nodes as

$$f(\mathbf{t}^{\le T}; A) = \prod_{t_i \le T} f(t_i \mid t_1, \ldots, t_N \setminus t_i; A). \tag{2}$$

Computing the likelihood of a cascade thus reduces to computing the conditional likelihood of the infection time of each node given the rest of the cascade. As in the independent cascade model (Kempe et al., 2003), we assume that a node gets infected once the first parent infects the node. Given an infected node $i$, we compute the likelihood of a potential parent $j$ to be the first parent by applying Eq. 1:

$$f(t_i \mid t_j; \alpha_{j,i}) \times \prod_{k \ne j,\, t_k < t_i} S(t_i \mid t_k; \alpha_{k,i}). \tag{3}$$

We now compute the conditional likelihoods of Eq. 2 by summing over the likelihoods of the mutually disjoint events that each potential parent is the first parent,

$$f(t_i \mid t_1, \ldots, t_N \setminus t_i; A) = \sum_{j: t_j < t_i} f(t_i \mid t_j; \alpha_{j,i}) \times \prod_{k \ne j,\, t_k < t_i} S(t_i \mid t_k; \alpha_{k,i}). \tag{4}$$

By Eq. 2 the likelihood of the infections in a cascade is

$$f(\mathbf{t}^{\le T}; A) = \prod_{t_i \le T} \sum_{j: t_j < t_i} f(t_i \mid t_j; \alpha_{j,i}) \times \prod_{k: t_k < t_i,\, k \ne j} S(t_i \mid t_k; \alpha_{k,i}). \tag{5}$$

Removing the condition $k \ne j$ makes the product independent of $j$,

$$f(\mathbf{t}^{\le T}; A) = \prod_{t_i \le T} \prod_{k: t_k < t_i} S(t_i \mid t_k; \alpha_{k,i}) \times \sum_{j: t_j < t_i} \frac{f(t_i \mid t_j; \alpha_{j,i})}{S(t_i \mid t_j; \alpha_{j,i})}. \tag{6}$$

Eq. 6 only considers infected nodes. However, the fact that some nodes are not infected during the observation window is also informative. We therefore add the multiplicative survival term from Eq. 1 and also replace the ratios in Eq. 6 with hazard functions:

$$f(\mathbf{t}; A) = \prod_{t_i \le T} \prod_{t_m > T} S(T \mid t_i; \alpha_{i,m}) \times \prod_{k: t_k < t_i} S(t_i \mid t_k; \alpha_{k,i}) \sum_{j: t_j < t_i} H(t_i \mid t_j; \alpha_{j,i}). \tag{7}$$

Assuming independent cascades, the likelihood of a set of cascades $C = \{\mathbf{t}^1, \ldots, \mathbf{t}^{|C|}\}$ is the product of the likelihoods of the individual cascades given by Eq. 7:

$$\prod_{\mathbf{t}^c \in C} f(\mathbf{t}^c; A). \tag{8}$$



Network Inference Problem. Our goal is to find the transmission rates $\alpha_{j,i}$ of every pair of nodes such that the likelihood of an observed set of cascades $C = \{\mathbf{t}^1, \ldots, \mathbf{t}^{|C|}\}$ is maximized. Thus, we aim to solve:

$$\begin{array}{ll} \text{minimize}_{A} & -\sum_{c \in C} \log f(\mathbf{t}^c; A) \\ \text{subject to} & \alpha_{j,i} \ge 0, \; i, j = 1, \ldots, N, \; i \ne j, \end{array} \tag{9}$$

where $A := \{\alpha_{j,i} \mid i, j = 1, \ldots, N, \; i \ne j\}$ are the variables. The edges of the network are those pairs of nodes with transmission rates $\alpha_{j,i} > 0$.
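As an illustration of the objective, the sketch below evaluates the log-likelihood of one cascade (the logarithm of Eq. 7) and the negative log-likelihood of a set of cascades for the exponential model. The function names, the nested-list layout `alpha[j][i]` for the rate of edge $j \to i$, and the decision to skip nodes with no earlier-infected node (cascade roots, whose infection the model does not explain) are assumptions of this sketch.

```python
import math

def exp_log_survival(dt, a):
    return -a * dt          # log S for the exponential model

def exp_hazard(dt, a):
    return a                # H for the exponential model

def cascade_log_likelihood(cascade, alpha, T,
                           log_survival=exp_log_survival, hazard=exp_hazard):
    """log f(t; A): cascade[i] is t_i (math.inf if not infected by T)."""
    n = len(cascade)
    infected = [i for i in range(n) if cascade[i] <= T]
    uninfected = [m for m in range(n) if cascade[m] > T]
    total = 0.0
    for i in infected:
        # Infected node i failed to infect each uninfected node m up to time T.
        for m in uninfected:
            total += log_survival(T - cascade[i], alpha[i][m])
        parents = [j for j in infected if cascade[j] < cascade[i]]
        if not parents:
            continue        # assumed root of the cascade: nothing to explain
        # Node i survived every earlier-infected node until t_i ...
        total += sum(log_survival(cascade[i] - cascade[j], alpha[j][i]) for j in parents)
        # ... and was then infected by one of them (sum of hazards).
        h = sum(hazard(cascade[i] - cascade[j], alpha[j][i]) for j in parents)
        total += math.log(h) if h > 0 else -math.inf
    return total

def objective(cascades, alpha, T):
    """Negative log-likelihood of a set of cascades, the quantity minimized in Eq. 9."""
    return -sum(cascade_log_likelihood(c, alpha, T) for c in cascades)
```

For a two-node cascade in which node 0 infects node 1 after 2 time units at rate 0.5, the log-likelihood reduces to $\log(0.5\, e^{-1})$, matching the exponential density.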

3. Proposed algorithm: NetRate 

The solution to Eq. 9 is unique, computable and consistent:

Theorem 1. Given log-concave survival functions and concave hazard functions in the parameter(s) of the pairwise transmission likelihoods, the network inference problem defined by Eq. 9 is convex in $A$.

Proof. By Eq. 8 the log-likelihood of a set of cascades is

$$L(\{\mathbf{t}^1 \ldots \mathbf{t}^{|C|}\}; A) = \sum_{c} \big[ \Psi_1(\mathbf{t}^c; A) + \Psi_2(\mathbf{t}^c; A) + \Psi_3(\mathbf{t}^c; A) \big], \tag{10}$$

where for each cascade $\mathbf{t}^c \in \{\mathbf{t}^1, \ldots, \mathbf{t}^{|C|}\}$,

$$\Psi_1(\mathbf{t}^c; A) = \sum_{i: t_i \le T} \sum_{t_m > T} \log S(T \mid t_i; \alpha_{i,m}),$$

$$\Psi_2(\mathbf{t}^c; A) = \sum_{i: t_i \le T} \sum_{j: t_j < t_i} \log S(t_i \mid t_j; \alpha_{j,i}),$$

$$\Psi_3(\mathbf{t}^c; A) = \sum_{i: t_i \le T} \log \left( \sum_{j: t_j < t_i} H(t_i \mid t_j; \alpha_{j,i}) \right).$$

If all pairwise transmission likelihoods between pairs of nodes in the network have log-concave survival functions and concave hazard functions in the parameter(s) of the pairwise transmission likelihoods, then convexity of Eq. 9 follows from linearity, composition rules for concavity, and concavity of the logarithm. □

Corollary 2. The network inference problem defined by Eq. 9 is convex for the exponential, power-law and Rayleigh models.
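As a worked instance of the corollary, consider the exponential model, for which $\log S(t_i \mid t_j; \alpha_{j,i}) = -\alpha_{j,i}(t_i - t_j)$ and $H(t_i \mid t_j; \alpha_{j,i}) = \alpha_{j,i}$. Substituting these into the three terms of the log-likelihood gives, for one cascade:

```latex
\begin{align*}
\Psi_1(\mathbf{t}^c; A) &= -\sum_{i: t_i \le T} \sum_{t_m > T} \alpha_{i,m}\,(T - t_i), \\
\Psi_2(\mathbf{t}^c; A) &= -\sum_{i: t_i \le T} \sum_{j: t_j < t_i} \alpha_{j,i}\,(t_i - t_j), \\
\Psi_3(\mathbf{t}^c; A) &= \sum_{i: t_i \le T} \log \sum_{j: t_j < t_i} \alpha_{j,i}.
\end{align*}
```

The first two terms are linear in $A$, and the third is a sum of logarithms of linear functions of $A$, hence concave. The log-likelihood is therefore concave and its negative, the objective of the minimization, is convex. The power-law and Rayleigh cases follow the same pattern, since their log survival functions and hazards are also linear in $\alpha_{j,i}$ for fixed infection times.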

Theorem 3. The maximum likelihood estimator $\hat{\alpha}$ given by the solution of Eq. 9 is consistent.

Proof Sketch. We check the criteria for consistency of identification, continuity and compactness (Newey & McFadden, 1994). The log-likelihood in Eq. 10 is a continuous function of $A$ for any fixed set of cascades $\{\mathbf{t}^1 \ldots \mathbf{t}^{|C|}\}$, and each $A$ defines a unique function $\log f(\cdot \mid A)$ on the set of cascades. Finally, note that $L \to -\infty$ for both $\alpha_{i,j} \to 0$ and $\alpha_{i,j} \to \infty$ for all $i, j$, so we lose nothing imposing upper and lower bounds thus restricting to a compact subset.

We refer to our network inference method as NetRate. 

Properties of NetRate. We highlight some common fea- 
tures of the solutions to the network inference problem for 
the exponential, power-law and Rayleigh models. 

All terms in Eq. 10 depend only on transmission rates $\alpha_{j,i}$ and infection time differences $(t_i - t_j)$, but not absolute infection times $t_i$ or $t_j$. Our formulation thus does not depend on the absolute time of the root node of each cascade.

The $\Psi_1$ and $\Psi_2$ terms contribute a positively weighted $\ell_1$-norm on vector $A$ that encourages sparse solutions (Boyd & Vandenberghe, 2004). The penalty arises naturally within the probabilistic model and therefore heuristic penalty terms to encourage sparsity are not necessary. Each term of the $\ell_1$-norm is linearly (exponential model), logarithmically (power-law) or quadratically (Rayleigh) weighted by infection times.

The $\Psi_2$ term penalizes edges $k \to i$ based on the infection time difference $t_i - t_k$. Edges transmitting infections slowly are heavily penalized, and edges transmitting quickly are penalized lightly. The $\Psi_1$ term penalizes edges $i \to j$ targeting uninfected nodes $j$ based on the time $T - t_i$ till the observation window cut-off. Lengthening the observation window produces harsher penalties - however, it also allows further infections. The penalties are finite, i.e., if no infection of node $j$ is observed, we can only say that it has survived until time $T$. There is insufficient evidence to claim $j$ will never be infected. NetRate does not use empirically ungrounded parameters (such as number of edges $k$ and penalty factor $p$ used by NetInf and ConNIe respectively) to leap from not observing an infection to inferring it is impossible. Instead, NetRate infers that infections are impossible across certain edges, i.e., that some of the optimal rates $\alpha_{j,i}$ are 0, based solely on the observed data and the length of the time horizon.

Figure 1. Panels (a-c) plot precision against recall; panels (d-f) plot accuracy. For ConNIe and NetInf we sweep over parameters $p$ (penalty factor) and $k$ (number of edges) respectively to control the solution sparsity in both algorithms, thereby generating a family of inferred models. NetRate has no tunable parameters and therefore yields a unique solution. (a,d): 1,024 node hierarchical Kronecker network with exponential model for 5,000 cascades. (b,e): 1,024 node Forest Fire network with power law model for 5,000 cascades. (c,f): 1,024 node random Kronecker network with Rayleigh model for 2,000 cascades.

The $\Psi_3$ term ensures infected nodes have at least one parent since otherwise the objective function would be negatively unbounded, i.e., $\log 0 = -\infty$. Moreover, our formulation encourages a natural diminishing property on the number of parents of a node - since the logarithm grows slowly, it weakly rewards infected nodes for having many parents.

Optimizing NetRate. We speed up the convex program 
by orders of magnitude via two improvements: 

Distributed optimization: The optimization problem splits into $N$ subproblems, one for each node $i$, in which we find the $N - 1$ rates $\alpha_{j,i}$, $j = 1, \ldots, N \setminus i$. The computation can be performed in parallel, obtaining local solutions that are globally optimal. Importantly, each node's computation only requires the infection times of other nodes in cascades it belongs to.

Unfeasible rates: If a pair $(j, i)$ is not in any common cascades, $\alpha_{j,i}$ only arises in the non-positive term $\Psi_1$ in Eq. 10, so the optimal $\alpha_{j,i}$ is zero. We therefore simplify the objective function by setting $\alpha_{j,i}$ to zero.



Solving NetRate. We solve Eq. 9 with CVX, a package for specifying and solving convex programs (Grant & Boyd, 2010).
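For readers without access to a convex solver, the per-node subproblem can be illustrated with a small pure-Python routine. This is not the CVX formulation: it swaps in an EM-style fixed-point update, and it covers only the exponential model, where the subproblem for node $i$ is to maximize $\sum_c \log \sum_{j \in P_c} \alpha_{j,i} - \sum_j \alpha_{j,i} E_j$, with $P_c$ the nodes infected before $i$ in cascade $c$ and $E_j$ the total time $j$ was infected while $i$ was not. The function name `fit_incoming_rates_exp`, its arguments and the iteration count are hypothetical choices made for this sketch.

```python
import math

def fit_incoming_rates_exp(cascades, i, T, iters=200):
    """Estimate the rates alpha[j] of edges j -> i under the exponential model.

    Uses an EM-style multiplicative update as a stand-in for a convex solver.
    """
    n = len(cascades[0])
    exposure = [0.0] * n      # E_j: time j was infected while i was still uninfected
    parent_sets = []          # P_c for every cascade in which i was infected by a parent
    for t in cascades:
        end = min(t[i], T)    # i stops being exposed once infected or at the cut-off
        parents = []
        for j in range(n):
            if j != i and t[j] < end:
                exposure[j] += end - t[j]
                if t[i] <= T:
                    parents.append(j)
        if parents:
            parent_sets.append(parents)
    # Pruning: a node that never precedes an infection of i gets rate zero.
    candidates = {j for parents in parent_sets for j in parents}
    alpha = [1.0 if j in candidates else 0.0 for j in range(n)]
    for _ in range(iters):
        credit = [0.0] * n    # expected number of infections of i attributed to j
        for parents in parent_sets:
            z = sum(alpha[j] for j in parents)
            for j in parents:
                credit[j] += alpha[j] / z
        alpha = [credit[j] / exposure[j] if j in candidates else 0.0 for j in range(n)]
    return alpha
```

Each update keeps the rates non-negative and does not decrease the node's log-likelihood, and at a fixed point every positive rate satisfies the stationarity condition of the subproblem. Since the subproblems for different nodes share no variables, calls for different `i` can be run in parallel. With a single potential parent the routine returns the familiar estimate of number of infections divided by total exposure time.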



4. Experimental evaluation 

We evaluate the performance of NetRate on: (i) synthetic networks that mimic the structure of social networks and (ii) real cascades extracted from the MemeTracker dataset. We show that NetRate discovers more than 95% of the edges in synthetic networks and more than 60% in real networks, accurately recovers transmission rates from diffusion data, and typically outperforms two previously developed inference algorithms, NetInf and ConNIe. We use the public implementations of NetInf and ConNIe.

4.1. Experiments on synthetic data 

Experimental setup. We focus on synthetic networks that mimic the structure of real-world diffusion networks - in particular, social networks. We consider two models of directed real-world social networks to generate diffusion networks: the Forest Fire (scale free) model (Barabasi & Albert, 1999) and the Kronecker Graph model (Leskovec et al., 2010). We generate three types of Kronecker graph with very different structures: random (Erdos & Renyi, 1960) (parameter matrix [0.5, 0.5; 0.5, 0.5]), hierarchical (Clauset et al., 2008) ([0.9, 0.1; 0.1, 0.9]) and core-periphery (Leskovec et al., 2008) ([0.9, 0.5; 0.5, 0.3]).

Data available at http://memetracker.org

Uncovering the Temporal Dynamics of Diffusion Networks

[Figure 2 plot: MAE on an axis from 0 to 1; series Core-Periphery, Hierarchical, Random and Forest-Fire; groups Exp, Pow and Ray.]

Figure 2. NetRate's normalized mean absolute error (MAE) for three types of Kronecker networks (1,024 nodes and 2,048 edges) and a Forest Fire network (1,024 nodes and 2,422 edges) for 5,000 cascades. We consider all three models of transmission: exponential (Exp), power-law (Pow) and Rayleigh (Ray).



First, we generate network $G^*$ by drawing transmission rates for edges $(j, i)$ from a uniform distribution. For the exponential and Rayleigh models $\alpha \in [0.01, 1]$ and for the power law $\alpha \in [0.01, 2]$. The transmission rate for an edge $(j, i)$ models how fast the information spreads from node $j$ to node $i$ in social networks. Then, we generate a set of cascades over $G^*$. Root nodes of cascades are chosen uniformly at random. As noted previously, the optimization problem depends on the time differences $(t_i - t_j)$. Therefore, our formulation does not depend on the absolute time of the root node of each cascade. Once a node is infected, the transmission likelihoods of outgoing edges determine the infection times of its neighbours. We record the time of the first infection if a node is infected more than once. Infections are not observed after a pre-specified time horizon $T$.
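The cascade generation procedure can be sketched as follows for the exponential model only. The adjacency layout (`rates[j]` is a dictionary mapping each out-neighbour $i$ of $j$ to $\alpha_{j,i}$), the function name `simulate_cascade` and the use of a priority queue are assumptions of this sketch rather than a description of the code used in the experiments.

```python
import heapq
import math
import random

def simulate_cascade(rates, root, T, rng):
    """Sample one cascade; returns infection times, math.inf if not infected by T."""
    n = len(rates)
    times = [math.inf] * n
    heap = [(0.0, root)]                    # the clock starts at 0 at the root
    while heap:
        t, j = heapq.heappop(heap)
        if times[j] <= t:
            continue                        # j was already infected: keep the first infection
        times[j] = t
        for i, a in rates[j].items():
            if a > 0 and math.isinf(times[i]):
                t_i = t + rng.expovariate(a)    # transmission delay with rate alpha_{j,i}
                if t_i <= T:                    # later infections are not observed
                    heapq.heappush(heap, (t_i, i))
    return times
```

A set of cascades is then obtained by calling `simulate_cascade` repeatedly with roots drawn uniformly at random, for example `simulate_cascade(rates, rng.randrange(len(rates)), T, rng)` with `rng = random.Random(seed)`.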

Accuracy of NetRate. We evaluate NetRate against two other inference methods, NetInf and ConNIe, by comparing the inferred and true networks via three measures: precision, recall and accuracy. Precision is the fraction of edges in the inferred network $\hat{G}$ present in the true network $G^*$. Recall is the fraction of edges of the true network $G^*$ present in the inferred network $\hat{G}$. Accuracy is

$$1 - \frac{\sum_{i,j} |I(\alpha^*_{i,j}) - I(\hat{\alpha}_{i,j})|}{\sum_{i,j} I(\alpha^*_{i,j}) + \sum_{i,j} I(\hat{\alpha}_{i,j})},$$

where $I(\alpha) = 1$ if $\alpha > 0$ and $I(\alpha) = 0$ otherwise. Inferred networks with no edges or only false edges have zero accuracy. Second, we evaluate how accurately NetRate infers transmission rates over edges by computing the normalized mean absolute error (MAE, i.e., $E[|\alpha^* - \hat{\alpha}| / \alpha^*]$, where $\alpha^*$ is the true transmission rate and $\hat{\alpha}$ is the estimated transmission rate). Note that in NetRate, as for real cascades, the probability of infection depends on both the transmission rate and the observation window. In contrast, ConNIe assigns probability priors to edges that are defined without reference to an observation window. Therefore, the values assigned to edges by NetRate and ConNIe are not comparable, so we do not compute MAE for ConNIe.
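The evaluation measures can be written down directly. In the sketch below, networks are represented as dictionaries mapping an edge `(j, i)` to its rate; this representation, the function names, the convention of returning 0 for an undefined ratio, and the choice to average the normalized error over the true edges (where $\alpha^* > 0$) are assumptions made for illustration.

```python
def edge_set(rates):
    """Edges are the pairs (j, i) with a strictly positive rate."""
    return {e for e, a in rates.items() if a > 0}

def precision_recall_accuracy(true_rates, est_rates):
    true_e, est_e = edge_set(true_rates), edge_set(est_rates)
    hits = len(true_e & est_e)
    precision = hits / len(est_e) if est_e else 0.0
    recall = hits / len(true_e) if true_e else 0.0
    mismatches = len(true_e ^ est_e)        # sum over pairs of |I(alpha*) - I(alpha_hat)|
    denom = len(true_e) + len(est_e)
    accuracy = 1.0 - mismatches / denom if denom else 0.0
    return precision, recall, accuracy

def normalized_mae(true_rates, est_rates):
    """Mean of |alpha* - alpha_hat| / alpha* over the true edges."""
    true_e = edge_set(true_rates)
    errors = [abs(true_rates[e] - est_rates.get(e, 0.0)) / true_rates[e] for e in true_e]
    return sum(errors) / len(errors)
```

With these definitions an inferred network with no edges, or with only false edges, has accuracy zero, as stated in the text.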

Figure 1 compares the precision, recall and accuracy of NetRate with NetInf and ConNIe for two types of Kronecker networks (hierarchical community structure and random) and a Forest Fire network over an observation window of length $T = 10$. In terms of precision-recall, NetRate outperforms ConNIe and NetInf for all the synthetic examples in the Pareto sense (Boyd & Vandenberghe, 2004). More specifically, if we set ConNIe and NetInf's tunable parameters to provide solutions with the same precision as NetRate, NetRate's recall is always higher than that of the other two methods. Strikingly, ConNIe and NetInf do not achieve NetRate's recall for any precision value. NetRate outperforms ConNIe with respect to accuracy for any penalty factor $p$ in all the synthetic examples. It is also more accurate than NetInf for most values of $k$ (number of edges). Importantly, NetInf and ConNIe yield a curve of solutions from which one has to select a point blindly (or at best heuristically), whereas NetRate yields a unique solution without any tuning.

Figure 2 shows the normalized MAE of the estimated transmission rates for the same networks, computed on 5,000
cascades. The normalized MAE is under 25% for almost all 
networks and transmission models - surprisingly low given 
we are estimating more than 2,000 non-zero real numbers. 

NetRate performance vs. cascade coverage. Observ- 
ing more cascades leads to higher precision-recall and more 
accurate estimates of the transmission rates. Figure 3(a)
plots the MAE of inferred networks against the number 
of observed cascades for a hierarchical Kronecker network 
with all three transmission models. Estimating transmis- 
sion rates is considerably harder than simply discovering 
edges and therefore more cascades are needed for accurate 
estimates. As many as 5,000 cascades are required to ob- 
tain normalized MAE values lower than 20%. 

NetRate performance vs. time horizon. Intuitively, the longer the observation window, the more accurately NetRate is able to infer transmission rates. Figure 3(b) confirms this intuition by showing the MAE of inferred networks for different time horizons $T$ for a hierarchical Kronecker network with exponential, power-law and Rayleigh transmission models for 5,000 cascades.



NetRate running time. Figure 3(c) plots the average running time to infer rates of all incoming edges to a node against number of nodes in a network (the number of edges is twice the number of nodes) on a single CPU. Further improvements can be achieved since NetRate naturally splits into a collection of subproblems, one per node. A cluster with 25 CPUs can therefore infer a network with 16,000 nodes (and 32,000 edges) in less than 4 hours.

Figure 3. (a) Cascade coverage, (b) time horizon, (c) running time. Panels (a,b) show NetRate's normalized MAE vs number of cascades and time horizon respectively for a hierarchical Kronecker network with 1,024 nodes and 2,048 edges with exponential (Exp), power-law (Pow) and Rayleigh (Ray) transmission models. Panel (c) plots NetRate's average running time to infer rates of all incoming edges to a node against network size (number of nodes) for a hierarchical Kronecker network.

4.2. Experiments on real data 

Dataset description. As in previous work tackling dif- 
fusion networks, we use the MemeTracker dataset, which 
contains more than 172 million news articles and blog posts 
from 1 million online sources. We use hyperlinks between 
blog posts to trace the flow of information. A site publishes 
a piece of information and uses hyper-links to refer to the 
same or closely related pieces of information published by 
other sites. These other sites link to still others and so on. A 
cascade is thus a collection of time-stamped hyperlinks be- 
tween sites (in blog posts) that refer to the same or closely 
related pieces of information. We record one cascade per 
piece - or closely related pieces - of information. We extract the top 500 media sites and blogs with the largest number of documents, 5,000 hyperlinks and 116,234 cascades.

Accuracy on real data. As the ground truth is unknown 
on real data, we proceed as follows. We create a network 
where there is an edge (u, v) if a post on a site u linked 
to a post on a site v. We consider this as the ground truth 
network G* and we use the hyperlink cascades to infer the 
network G and evaluate how many edges our method esti- 
mates properly. We assume an exponential model. 

Figure 4 compares our results with NetInf and ConNIe. As in the synthetic experiments, NetRate yields a unique solution whereas the other algorithms produce curves of solutions. Panel (a) shows that NetRate performs comparably to NetInf and outperforms ConNIe on precision and recall. Panel (b) plots accuracy: NetRate outperforms the other two algorithms on the majority of their outputted solutions, and almost matches their best performances. Since there are no principled methods for choosing single solutions for NetInf and ConNIe, there is no guarantee that the solutions chosen from the curves will be anywhere near the highest achievable value.

5. Conclusions 

We have developed a flexible model, NetRate, of the spatiotemporal structure underlying diffusion processes. The model makes minimal assumptions about the physical, biological or cognitive mechanisms responsible for diffusion. Instead, it infers transmission rates between nodes of a network by computing the model that maximizes the likelihood of the observed data - temporal traces left by cascades of infections. Qualitative assumptions about infections (e.g., are they long-tailed?) determine the choice of parametric model on the edges. An interesting feature of NetRate, to be investigated in future work, is the possibility of mixing exponential, power law, Rayleigh or other models within a single inference algorithm, thus providing tremendous flexibility in fitting real data which may combine long-tailed, faddish and other qualitative behaviors.

Remarkably, introducing continuous temporal dynamics, allowing variable transmission rates across edges, and avoiding further assumptions dramatically simplified the problem compared with previous approaches (Gomez-Rodriguez et al., 2010; Meyers & Leskovec, 2010). The model has parameters with natural interpretations, and it leads to a well-defined convex maximum likelihood problem that can be solved efficiently. Importantly, we do not need to tune parameters by hand to control the sparsity of the inferred network (i.e., number of edges to infer or penalty terms). Heuristic $\ell_1$-like penalty terms, such as the ones used in Meyers & Leskovec (2010), are unnecessary since the probabilistic model naturally imposes sparsity.

We evaluated NetRate on a wide range of synthetic diffusion networks with heterogeneous temporal dynamics which aim to mimic the structure of real-world social and information networks. NetRate provides a unique solution to the network inference problem with high recall, precision and accuracy. A direct comparison with the current state of the art is difficult, since these methods include a parameter controlling the sparsity of the inferred network that requires blind tuning. Nevertheless, NetRate is typically better in terms of accuracy than previous methods across the full range of their tunable parameters. In addition, it accurately estimates transmission rates, which other methods cannot estimate at all. The performance of ConNIe appears significantly worse than reported in Meyers & Leskovec (2010): a possible explanation for the degradation is that in our work, we consider networks with heterogeneous temporal dynamics. It is surprising how well NetInf performs in comparison with NetRate despite assuming uniform temporal dynamics and priors.

Figure 4. Real data. Precision-recall and accuracy of NetRate, NetInf and ConNIe, with an exponential model, on a 500 node hyperlink network with 5,000 edges using hyperlink cascades.



Finally, we evaluated NetRate on real data. Again, Net- 
Rate provides a unique solution to the network inference 
problem but in this case, as expected, the values of re- 
call, precision and accuracy are modest - adopting a simple 
parametric pairwise transmission model is a simplistic as- 
sumption on real data. In terms of accuracy, it outperforms 
previous methods across a significant part of the full range 
of their tunable parameters. 

NetRate provides a novel view of diffusion processes. 
We believe it can be fruitfully applied to several lines of 
research including influence maximization, control of epi- 
demics, and causal inference.